Active Learning Based Corpus Annotation
Authors
Abstract
Opinion mining aims to automatically extract useful opinion information and knowledge from subjective texts. Research on Chinese opinion mining requires annotated corpora of Chinese subjective, opinion-bearing texts. To ease the work of corpus annotators, this paper implements an active learning based annotation tool for Chinese opinion elements that automatically identifies the topic, sentiment, and opinion holder in a sentence.
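The abstract does not say which query strategy the tool uses. As a minimal sketch of how an active learning annotation loop typically picks sentences for the annotator, the example below uses entropy-based uncertainty sampling, which is one common choice and an assumption here. The function names, sentence ids, and probabilities are all hypothetical and not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, batch_size):
    """Return the ids of the `batch_size` sentences whose predicted
    label distribution is most uncertain (highest entropy first)."""
    ranked = sorted(
        predictions,
        key=lambda sid: entropy(predictions[sid]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical model output for unlabeled sentences:
# sentence id -> [P(positive), P(negative), P(neutral)]
predictions = {
    "s1": [0.98, 0.01, 0.01],  # confident prediction, low priority
    "s2": [0.40, 0.35, 0.25],
    "s3": [0.70, 0.20, 0.10],
    "s4": [0.34, 0.33, 0.33],  # near-uniform, most uncertain
}

# Sentences to show the human annotator next.
queue = select_for_annotation(predictions, batch_size=2)
```

In a full loop, the annotator would label the queued sentences, the model would be retrained on the enlarged labeled set, and the selection step would be repeated on the remaining unlabeled sentences.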
Similar resources
Modeling the Annotation Process for Ancient Corpus Creation
In corpus creation, human annotation is expensive. Annotation costs can be minimized through machine learning and active learning; however, there are many complex interactions among the machine learner, the active learning technique, the annotation cost, human annotation accuracy, the annotator user interface, and several other elements of the process. For example, we show that changing the way i...
Designing an active learning based system for corpus annotation
In this paper we review some active learning experimental results in order to lay the basis for designing an active learning based system for corpus annotation. Based on the experimental data, we design a modular system that learns quickly at first but is capable of switching to a slower and more precise learning strategy. The system is designed to perform a semantic role ...
Morphological Annotation of a Large Spontaneous Speech Corpus in Japanese
We propose an efficient framework for human-aided morphological annotation of a large spontaneous speech corpus such as the Corpus of Spontaneous Japanese. In this framework, even when word units have several definitions in a given corpus, and not all words are found in a dictionary or in a training corpus, we can morphologically analyze the given corpus with high accuracy and low labor costs by...
Approximating Learning Curves for Active-Learning-Driven Annotation
Active learning (AL) is getting more and more popular as a methodology to considerably reduce the annotation effort when building training material for statistical learning methods for various NLP tasks. A crucial issue rarely addressed, however, is when to actually stop the annotation process to profit from the savings in efforts. This question is tightly related to estimating the classifier p...
Video Corpus Annotation Using Active Learning
Concept indexing in multimedia libraries is very useful for users searching and browsing, but it is also a very challenging research problem. Beyond system implementation issues, semantic indexing depends strongly on the size and quality of the training examples. In this paper, we describe the collaborative annotation system used to annotate the High Level Features (HLF) in the ...